High quality video in high dynamic range scenes from interlaced dual-ISO footage
Abstract
In this paper we present a simple and affordable method to generate high-quality video from a high dynamic range scene. It requires no extra lighting, no alternating exposures, and no dual-camera set-up. Our input is an interlaced video that alternates row pairs with different ISO values, as some DSLR camera models can provide. The proposed algorithm involves two main steps: first, the computation of two single-ISO full-frame images (one for each ISO value) using an inpainting-based deinterlacing method; second, their combination into a single frame by a weighted average. This results in a high dynamic range frame, containing all the details in bright and dark areas at the same time, which is finally tone-mapped into a low dynamic range frame for display purposes. Current results show this is a practical and cost-effective method that produces outputs free of ghosting artifacts and with very little noise.

Introduction

The human visual system is able to adjust to world scenes where the light intensity values cover a very wide range, capturing details in dark and bright areas simultaneously. Dynamic range is defined as the ratio between the brightest and the darkest intensity levels. While in common situations the light coming from a scene is of high dynamic range (HDR), the vast majority of camera sensors (and displays) are of low dynamic range (LDR). There is a vast literature on methods for creating HDR images using regular LDR sensors, starting with the seminal approaches of [17, 7]. In those works, several LDR pictures of the same HDR scene are taken with varying exposure times, so that the short exposures capture details in the bright regions and the long exposures capture details in the dark regions; all these images are then combined into a single HDR image with overall detail visibility.
In order to show this image on a regular LDR display, it has to be transformed through a process called tone-mapping, which compresses the dynamic range of the image while trying to maintain its details and natural appearance.

In the movie industry there is a growing interest in HDR imaging, but the challenge of shooting HDR scenes using LDR equipment has existed since the inception of cinema. The usual way to address it is to add artificial lights, whose effect is to raise the intensity levels of the darkest parts of the image, hence reducing the dynamic range of the scene and fitting it into the reduced range of the capture medium (film or digital). This is a cumbersome, expensive procedure requiring very significant human and material resources that greatly affect the cost of the production. Some alternatives exist, but they are not fully practical. Some digital cinema camera models are able to alternate exposure times on consecutive frames, creating pairs of different-exposure images that are then fused following the approach of [7], but camera and/or object motion produces ghosting artifacts in the fusion results. A recent possibility is to use a dual-camera set-up [9], with two synchronized, perfectly registered cameras on an orthogonal rig, so that a semi-transparent mirror sends most of the light intensity to one of the cameras and the rest to the other camera. These images can be fused without problem because they are fully aligned, so there is no risk of ghosting, but the dual-camera process has limitations: cost and practicality considerations stemming from the use of two cameras, image problems caused by imperfections in the mirror, and the need to perform tone-mapping on the output. For more details we refer the reader to [3].

More recently, a very small number of works have performed HDR reconstruction from a single interlaced image. Gu et al. [11] combine rows taken with different exposure times; since the rows are not captured simultaneously, this method produces ghosting artifacts as well, which need to be reduced by estimating and compensating for the motion blur. The camera software Magic Lantern (ML) [16] allows some camera models to capture images or video with dual-ISO values that alternate between consecutive image line pairs. It provides an implementation to interpolate a full-frame low-ISO image containing less noise in shadow areas. It does not claim to compute an HDR image, though the final picture is the result of combining the information from both the low-ISO and the high-ISO full-frame images. The method follows a chain of steps: separate the two ISO frames, interpolate the missing lines to get the full images, and combine information from both interpolated frames to greatly reduce the noise in dark regions.

Hajisharif et al. [12] perform demosaicing, denoising, re-sampling and HDR reconstruction at the same time, starting from the interlaced input provided by the ML software [16]. This method requires a prior radiometric calibration process, and therefore it cannot be used when the camera is not available. A similar idea was developed by Heide et al. [13], who propose a single optimization step using image priors and regularizers for the different stages of the camera color pipeline (denoising, demosaicing, etc.). The selection of image priors is crucial for the optimization process, and the values are highly dependent on the set of images selected for learning the best weights. These latter two methods have the advantage of working directly with the RAW data without following a staged pipeline, so no cumulative errors are carried over from one process to the next. Nevertheless, this integration makes it difficult to further extend the processes involved, since they are not independent, and it is also challenging to locate and rectify errors in the pipeline.
Our main contribution in this work is to propose a simple and effective method to shoot high-quality video in HDR scenarios using a single camera capable of recording interlaced dual-ISO footage. The interlaced input guarantees that the result will be free of ghosting artifacts, because the low-ISO and the high-ISO lines are recorded simultaneously. No access to the camera used to record the input is required. Our implementation pipeline incorporates a set of stages: calculating the full-frame single-ISO images by applying a deinterlacing algorithm, combining these full-frame images into a single HDR picture, and finally applying a tone-mapping operator to produce an LDR output. For these stages we adapt to our setting state-of-the-art algorithms which produce high-quality results. Tests and comparisons show that our method outperforms other approaches both quantitatively (in terms of PSNR) and qualitatively (no spurious colors, better edge preservation, less noise).

Figure 1. Schematic of the proposed method. First the dual-ISO input is split into two half-size images Il/2 and Ih/2, each one with the rows corresponding to a single ISO value. Next a deinterlacing method is used to generate full-frame images Il and Ih from Il/2 and Ih/2 respectively. These full-frame images are linearly combined and tone-mapped to produce the final output.

Methodology

The input to our algorithm is a video sequence in RAW format, where each frame alternates row pairs with different ISO values, and the output is an LDR video sequence with simultaneous detail visibility in the dark and bright zones of the picture; see the scheme in Figure 1. All the stages of our method are applied on a frame-by-frame basis except for the final tone-mapping, which imposes temporal consistency on the output by considering several input frames at the same time. We now describe our proposed method in detail.
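To make the input layout concrete, here is a minimal Python sketch of how a frame that alternates row pairs could be separated into the two half-size images Il/2 and Ih/2. It is an illustration only, not the authors' implementation: the function name `split_dual_iso`, the `low_first` flag, and the assumption that the first row pair carries the low ISO value are all hypothetical, since the actual ordering depends on the camera.

```python
import numpy as np

def split_dual_iso(frame, low_first=True):
    """Split an interlaced dual-ISO frame into two half-height images.

    Assumption: rows alternate in pairs (rows 0-1 at one ISO, rows 2-3 at
    the other, and so on). Which ISO comes first is camera dependent, so
    `low_first` is a hypothetical switch the caller must set.
    """
    frame = np.asarray(frame)
    height = frame.shape[0]
    if height % 4 != 0:
        raise ValueError("frame height must be a multiple of 4")
    # Row index modulo 4 in {0, 1} -> first ISO; in {2, 3} -> second ISO.
    in_first_pair = (np.arange(height) % 4) < 2
    first, second = frame[in_first_pair], frame[~in_first_pair]
    return (first, second) if low_first else (second, first)

# Toy 8x4 frame whose pixel value equals its row index.
frame = np.repeat(np.arange(8)[:, None], 4, axis=1)
half_low, half_high = split_dual_iso(frame)
print(half_low[:, 0])   # rows taken from the first pair of every group of four
print(half_high[:, 0])  # rows taken from the second pair of every group of four
```

Because the rows are taken in pairs, each half-size image still contains complete 2×2 colour-filter cells, which is what allows the later stages to treat it as a RAW picture in its own right.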
Generation of single-ISO full-frame images

The first stage consists of computing two single-ISO full-frame images from an input dual-ISO frame, using an inpainting-based deinterlacing method. To obtain the full-frame pictures Il (for the low ISO value) and Ih (for the high ISO value), we proceed as follows:

• Split the dual-ISO input into two half-size images Il/2 and Ih/2, each one with the rows corresponding to a single ISO value; see Figure 2.
• Using an adapted version of the inpainting-based deinterlacing method [2], generate full-frame images Il and Ih from Il/2 and Ih/2 respectively.
• Perform demosaicing using [20], and apply a refinement step to improve the interpolated results; see Figure 3.

Figure 2. The dual-ISO input frame (left), which is split into two half-size images Il/2 and Ih/2 (right), each one with the rows corresponding to a single ISO value.

Figure 3. Generated full-frame images Il (left) and Ih (right).

Row interpolation by deinterlacing

In order to interpolate the missing rows and generate Il from Il/2 and Ih from Ih/2, we adapt the deinterlacing method of Ballester et al. [2]. This is a state-of-the-art technique that follows the dense stereo matching approach of Cox et al. [5] and fills in a missing line L0 between two given lines L− and L+ by (see Fig. 4):

• First, performing a global matching between lines L− and L+ by computing the correlation matrix between their image values and finding the matches (optimal path) through dynamic programming. In practice only a band around the diagonal is considered in the search for the optimal path, and the matches are estimated assuming a certain noise variance for the image values.
• Second, each matching pair of pixels determines a segment that crosses the missing line L0 at a pixel location that is filled in with the average of the matching pair.
For those points that were not matched, their values are computed by bilinear interpolation from the neighboring correspondences.

The images Il/2 and Ih/2 are color filter array (CFA) RAW pictures with the 2×2 Bayer pattern ‘RGGB’. With the R and B values we create half-size channels that can be deinterlaced directly with [2]. For the G channel we first demosaic the values using [20] and then apply the deinterlacing method [2], modified so that it fills in two consecutive lines simultaneously given the upper and lower neighboring lines of the pair; see Figure 5.

Figure 4. Inpainting-based deinterlacing algorithm. Left: global matching between two given lines L− and L+. Right: each matching pair of pixels determines a segment that crosses the missing line L0 at a pixel location that is filled in with the average of the matching pair. Figure adapted from
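The two-step line fill described in this section (global matching by dynamic programming, then averaging along the matched segments) can be illustrated with the following Python sketch for a single channel and a single missing line. It is a simplified stand-in, not the method of [2]: the squared-difference cost scaled by an assumed noise level `sigma`, the fixed `skip_cost` for unmatched pixels, the `band` half-width, and the use of 1-D linear interpolation for unmatched positions are all assumptions made for this example, and the function names are hypothetical.

```python
import numpy as np

def match_lines(upper, lower, band=3, sigma=2.0, skip_cost=4.0):
    """Return monotone matched index pairs (i, j) between two lines.

    Dynamic programming restricted to |i - j| <= band. The match cost is a
    squared difference scaled by an assumed noise variance; `skip_cost` is
    a hypothetical penalty for leaving a pixel unmatched.
    """
    n = len(upper)
    if len(lower) != n:
        raise ValueError("both lines must have the same length")
    cost = np.full((n + 1, n + 1), np.inf)
    move = np.zeros((n + 1, n + 1), dtype=np.int8)  # 0 match, 1 skip upper, 2 skip lower
    cost[0, 0] = 0.0
    for i in range(n + 1):
        for j in range(max(0, i - band), min(n, i + band) + 1):
            if i == 0 and j == 0:
                continue
            options = []
            if i > 0 and j > 0:
                diff = float(upper[i - 1]) - float(lower[j - 1])
                options.append((cost[i - 1, j - 1] + diff * diff / (2.0 * sigma ** 2), 0))
            if i > 0:
                options.append((cost[i - 1, j] + skip_cost, 1))
            if j > 0:
                options.append((cost[i, j - 1] + skip_cost, 2))
            cost[i, j], move[i, j] = min(options)
    # Backtrack from the end of both lines to recover the optimal path.
    pairs, i, j = [], n, n
    while i > 0 or j > 0:
        if move[i, j] == 0:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif move[i, j] == 1:
            i -= 1
        else:
            j -= 1
    return pairs[::-1]

def fill_missing_line(upper, lower, **kwargs):
    """Fill the line between `upper` and `lower` along the matched segments."""
    upper = np.asarray(upper, dtype=float)
    lower = np.asarray(lower, dtype=float)
    pairs = match_lines(upper, lower, **kwargs)
    if not pairs:
        return 0.5 * (upper + lower)  # fallback: plain vertical average
    # A segment from column i (upper) to column j (lower) crosses the
    # missing line at column (i + j) / 2 and carries the pair's average.
    xs = np.array([(i + j) / 2.0 for i, j in pairs])
    vals = np.array([(upper[i] + lower[j]) / 2.0 for i, j in pairs])
    # Columns without a sample are interpolated from neighbouring samples.
    return np.interp(np.arange(len(upper)), xs, vals)

# A vertical edge that shifts by two pixels between the two known lines.
upper = [10, 10, 10, 10, 200, 200, 200, 200, 200, 200]
lower = [10, 10, 10, 10, 10, 10, 200, 200, 200, 200]
filled = fill_missing_line(upper, lower)
print(filled)
```

In this toy case a plain vertical average would produce a two-pixel-wide grey step between the dark and bright regions, whereas the matched fill places the transition between the edge positions of the two known lines. The sketch runs in time proportional to the line length times the band width, which is the practical reason for restricting the search to a band around the diagonal.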
Publication date: 2016